Stochastic boosting algorithms
نویسندگان
چکیده
In this article, we discuss a class of stochastic boosting algorithms, which corrects and develops the work of [23], showing how to perform statistical inference in a computationally efficient manner. Sequential Monte Carlo (SMC) methods are used to illustrate that the stochastic boosting methods can provide better predictions, for a higher computational cost, than the corresponding boosting algorithm. A theoretical result is also given, which expresses an upper-bound of the posterior-predictive test error, in terms of that of boosting. The result shows that the averaged predictions used, are relatively stable with respect to boosting, when the latter provides the single best prediction. We also investigate the method on a real case study from machine learning and in a regression context, showing that it can be a useful tool for data exploration.
منابع مشابه
ada: An R Package for Stochastic Boosting
Boosting is an iterative algorithm that combines simple classification rules with ‘mediocre’ performance in terms of misclassification error rate to produce a highly accurate classification rule. Stochastic gradient boosting provides an enhancement which incorporates a random mechanism at each boosting step showing an improvement in performance and speed in generating the ensemble. ada is an R ...
متن کاملOn Adaptive Regularization Methods in Boosting
Boosting algorithms build models on dictionaries of learners constructed from the data, where a coefficient in this model relates to the contribution of a particular learner relative to the other learners in the dictionary. Regularization for these models is currently implemented by iteratively applying a simple local tolerance parameter, which scales each coefficient towards zero. Stochastic e...
متن کاملMulti-class Image Classification Based on Fast Stochastic Gradient Boosting
Nowadays, image classification is one of the hottest and most difficult research domains. It involves two aspects of problem. One is image feature representation and coding, the other is the usage of classifier. For better accuracy and running efficiency of high dimension characteristics circumstance in image classification, this paper proposes a novel framework for multi-class image classifica...
متن کاملEvaluation of Data Mining Algorithms for Detection of Liver Disease
Background and Aim: The liver, as one of the largest internal organs in the body, is responsible for many vital functions including purifying and purifying blood, regulating the body's hormones, preserving glucose, and the body. Therefore, disruptions in the functioning of these problems will sometimes be irreparable. Early prediction of these diseases will help their early and effective treatm...
متن کاملStochastic Regularization and Boosting for Bioinformatics
Boosting techniques are extremely popular in many domains where one wishes to learn effective classification functions and identify the most relevant variables at the same time. However, in general, these techniques do not perform well on bioinformatic data sets. One known reason that makes bioinformatics data sets particularly problematic for Boosting, is that they typically contain very few t...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- Statistics and Computing
دوره 21 شماره
صفحات -
تاریخ انتشار 2011